Transformer-based Planning for Symbolic Regression
Symbolic regression (SR) is a challenging task in machine learning that
involves finding a mathematical expression for a function based on its values.
Recent advancements in SR have demonstrated the effectiveness of pretrained
transformer-based models in generating equations as sequences, leveraging
large-scale pretraining on synthetic datasets and offering notable advantages
in terms of inference time over genetic programming (GP) based methods. However, these models
primarily rely on supervised pretraining goals borrowed from text generation
and overlook equation-specific objectives like accuracy and complexity. To
address this, we propose TPSR, a Transformer-based Planning strategy for
Symbolic Regression that incorporates Monte Carlo Tree Search into the
transformer decoding process. Unlike conventional decoding strategies, TPSR
enables the integration of non-differentiable feedback, such as fitting
accuracy and complexity, as external sources of knowledge into the
transformer-based equation generation process. Extensive experiments on various
datasets show that our approach outperforms state-of-the-art methods, enhancing
the model's fitting-complexity trade-off, extrapolation abilities, and
robustness to noise.
Comment: Parshin Shojaee and Kazem Meidani contributed equally to this work.
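To make the role of non-differentiable feedback concrete, the following Python sketch shows the kind of reward an MCTS planner could optimize during decoding: fitting accuracy blended with a complexity penalty. The function name equation_reward, the 1/(1+NMSE) accuracy term, and the exponential complexity penalty with weight lam are illustrative assumptions for exposition, not necessarily the exact formulation used in TPSR.

import numpy as np

def equation_reward(y_true, y_pred, complexity, lam=0.1):
    # Accuracy term in (0, 1]: 1 / (1 + normalized mean squared error).
    nmse = np.mean((y_true - y_pred) ** 2) / (np.var(y_true) + 1e-12)
    accuracy = 1.0 / (1.0 + nmse)
    # Complexity term: exponential penalty on expression length (token count).
    # Both lam and the functional forms here are illustrative choices.
    return accuracy * np.exp(-lam * complexity)

# Score two candidate equations against the same noisy data y = 2x^2 + noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 2.0 * x ** 2 + 0.1 * rng.normal(size=200)
short_fit = 2.0 * x ** 2            # accurate, low complexity
long_fit = 2.0 * x ** 2 + 0.01 * x  # similar fit, higher complexity
print(equation_reward(y, short_fit, complexity=3))  # higher reward
print(equation_reward(y, long_fit, complexity=7))   # penalized for length

Because the reward is evaluated on decoded candidate sequences rather than backpropagated through the model, it can combine any non-differentiable criteria, which is what makes it usable inside tree search.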
SNIP: Bridging Mathematical Symbolic and Numeric Realms with Unified Pre-training
In an era where symbolic mathematical equations are indispensable for
modeling complex natural phenomena, scientific inquiry often involves
collecting observations and translating them into mathematical expressions.
Recently, deep learning has emerged as a powerful tool for extracting insights
from data. However, existing models typically specialize in either numeric or
symbolic domains, and are usually trained in a supervised manner tailored to
specific tasks. This approach neglects the substantial benefits that could
arise from a task-agnostic unified understanding between symbolic equations and
their numeric counterparts. To bridge the gap, we introduce SNIP, a
Symbolic-Numeric Integrated Pre-training framework, which employs joint contrastive
learning between symbolic and numeric domains, enhancing their mutual
similarities in the pre-trained embeddings. By performing latent space
analysis, we observe that SNIP provides cross-domain insights into the
representations, revealing that symbolic supervision enhances the embeddings of
numeric data and vice versa. We evaluate SNIP across diverse tasks, including
symbolic-to-numeric mathematical property prediction and numeric-to-symbolic
equation discovery, commonly known as symbolic regression. Results show that
SNIP effectively transfers to various tasks, consistently outperforming fully
supervised baselines and competing strongly with established task-specific
methods, especially in few-shot learning scenarios where available data is
limited.
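As an illustration of the joint contrastive objective described above, the following PyTorch sketch computes a symmetric InfoNCE loss between paired symbolic and numeric embeddings, in the style of CLIP. The function name, the temperature value, and the exact symmetric formulation are assumptions made for exposition; SNIP's actual loss may differ in its details.

import torch
import torch.nn.functional as F

def symbolic_numeric_contrastive_loss(sym_emb, num_emb, temperature=0.07):
    # Row i of sym_emb and row i of num_emb come from the same
    # equation/data pair; other rows in the batch act as negatives.
    sym = F.normalize(sym_emb, dim=-1)
    num = F.normalize(num_emb, dim=-1)
    logits = sym @ num.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(sym.size(0))   # matching pairs on the diagonal
    loss_s2n = F.cross_entropy(logits, targets)      # symbolic -> numeric
    loss_n2s = F.cross_entropy(logits.t(), targets)  # numeric -> symbolic
    return 0.5 * (loss_s2n + loss_n2s)

# Toy batch: 8 paired embeddings of dimension 128 from the two encoders.
sym_emb = torch.randn(8, 128)
num_emb = torch.randn(8, 128)
print(symbolic_numeric_contrastive_loss(sym_emb, num_emb))

Pulling matched symbolic/numeric pairs together while pushing mismatched pairs apart is what yields the mutual similarities in the pre-trained embeddings that the abstract describes.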
Transformer for Partial Differential Equations' Operator Learning
Data-driven learning of partial differential equations' solution operators
has recently emerged as a promising paradigm for approximating the underlying
solutions. The solution operators are usually parameterized by deep learning
models that are built upon problem-specific inductive biases. An example is a
convolutional or a graph neural network that exploits the local grid structure
where functions' values are sampled. The attention mechanism, on the other
hand, provides a flexible way to implicitly exploit the patterns within inputs,
and, furthermore, the relationship between arbitrary query locations and inputs. In
this work, we present an attention-based framework for data-driven operator
learning, which we term Operator Transformer (OFormer). Our framework is built
upon self-attention, cross-attention, and a set of point-wise multilayer
perceptrons (MLPs), and thus it makes few assumptions on the sampling pattern
of the input function or query locations. We show that the proposed framework
is competitive on standard benchmark problems and can flexibly be adapted to
randomly sampled input.
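The following PyTorch sketch illustrates the architectural idea: input-function samples (coordinates plus values) are encoded with self-attention, and arbitrary query coordinates read off predictions via cross-attention followed by a point-wise MLP. The class name CrossAttentionOperator, the layer widths, and the single-layer depth are illustrative placeholders rather than OFormer's actual configuration; residual connections and normalization are omitted for brevity.

import torch
import torch.nn as nn

class CrossAttentionOperator(nn.Module):
    def __init__(self, coord_dim=1, width=64, heads=4):
        super().__init__()
        self.embed_in = nn.Linear(coord_dim + 1, width)  # (x, u(x)) -> token
        self.embed_q = nn.Linear(coord_dim, width)       # query location -> token
        self.self_attn = nn.MultiheadAttention(width, heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(width, heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(width, width), nn.GELU(),
                                 nn.Linear(width, 1))

    def forward(self, in_coords, in_values, query_coords):
        # Encode the sampled input function with self-attention.
        tokens = self.embed_in(torch.cat([in_coords, in_values], dim=-1))
        tokens, _ = self.self_attn(tokens, tokens, tokens)
        # Arbitrary query locations attend to the encoded input points.
        queries = self.embed_q(query_coords)
        latent, _ = self.cross_attn(queries, tokens, tokens)
        # Point-wise decoding: each query is mapped independently.
        return self.mlp(latent)

# Irregularly sampled input function and off-grid query locations.
model = CrossAttentionOperator()
x_in = torch.rand(2, 50, 1)          # 50 random sample locations per batch
u_in = torch.sin(3.0 * x_in)         # function values at those locations
x_q = torch.rand(2, 80, 1)           # 80 arbitrary query locations
print(model(x_in, u_in, x_q).shape)  # -> torch.Size([2, 80, 1])

Because nothing in the attention layers assumes a grid, the input samples and the query locations can be placed anywhere and can differ in number, which is the flexibility the abstract attributes to the attention mechanism.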